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Abstract 

We study the effect of fluid flow on three-dimensional (3D) dendrite growth using a phase- 
field model on an adaptive finite element grid. In order to simulate 3D fiuid fiow, we use 
an averaging method for the flow problem coupled to the phase-field method and the Semi- 
Implicit Approximated Projection Method (SIAPM). We describe a parallel implementation for 
the algorithm, using Charm-|--|- FEM framework, and demonstrate its efficiency. We introduce 
an improved method for extracting dendrite tip position and tip radius, facilitating accurate 
comparison to theory. We benchmark our results for two-dimensional (2D) dendrite growth with 
solvability theory and previous results, finding them to be in good agreement. The physics of 
dendritic growth with fluid flow in three dimensions is very different from that in two dimensions, 
and we discuss the origin of this behavior. 

1 Introduction 

Dendrites are the basic microstructural form for most crystalline materials. They express the un- 
derlying crystalline symmetry, as well as the growth conditions which existed when the dendrite was 
formed. Dendrites may form from the vapor phase (e.g., snowflakes), from solution (e.g., polymer 
crystals), or by solidification from the melt (e.g., most metals). In this work, we focus our attention 
on growth from the melt, which is important in many materials processing applications. Dendritic 
growth produces local compositional variations which determine the macroscopic properties of the 
material. These features persist through subsequent processing, and it is therefore important to 
understand the mechanisms by which the microstructural pattern is selected. 

Beginning with the morphological stability theory of Mullins and Sekerka Q , the dynamics of 
pattern selection are now reasonably well understood. For a review of the theory, see Langer et 
al. and Kessler et al. 1^ There is now a consensus that the so-called "microscopic solvability" 
theory |Q,|5[ agrees very well with numerical calculations. |^,^ These comparisons were performed in 
two dimensions (2D), where accurate time-dependent simulations of dendrite growth are tractable 
using phase-field methods, and also level set techniques |^ In this paper, we focus on the phase-field 
method; for a description of the level set method as applied to solidification problems, see Chen et 
al. @ 

It is known that fiuid fiow during solidification dramatically alters the solidification structure. 



[10 1 Using typical values for the local flow velocity, material properties and process parameters, one 
can anticipate that the interdendritic flow is dominated by viscous forces (Re ~ 0(0.1)), but that 
the diffusion fields for temperature and solute are dominated by advective effects (Pe ~ 0(10—100)). 
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The presence of the flow admits the possibihty of instabihties due to the flow itself, in addition to 
the morphological instabilities normally found in crystal growth. 

The effect of fluid flow on dendritic growth is the object of this research. This is an inherently 
three-dimensional phenomenon, as can be seen in the schematic drawing in Figure |l[ A pair of 
dendrites is shown growing into a flow which is nominally perpendicular to their primary growth 
direction. The mechanism by which the flow alters the growth pattern is the transport of solute 
from the leading edge to the trailing edge of the dendrite. In 2D, this occurs by the flow going 
up and over the dendrite. However in three dimensions (3D), it is much easier for this transport 
to take place by having the flow go around the dendrite. Thus, in order to correctly model this 
phenomenon, we must do 3D simulations. 




2D 3D 



Figure 1: Schematic drawing of the flow over dendrites growing perpendicular to a superimposed 
flow, comparing 2D and 3D phenomena. 

An example of a dendrite computed with fluid flow present is given in Figure |2[ This calculation 
was done using the methods we describe later in this paper, but we introduce it here to provide a 
context for describing the physical problem. The shape of the dendrite is complex, and it evolves 
during the computation. The far-field flow on the left-hand side of the dendrite is a uniform velocity, 
directed parallel to a preferred crystalline growth direction. The figure shows streamtraces for the 
flow over the dendrite. Notice that growth is enhanced in the directions counter to the flow. 
Sidebranches also appear preferentially on the leading edge of the transverse arms, and the trailing 
arm is completely suppressed. More will be said about this in Section ^. 

The surface of the dendrite represents the interface between the solid and liquid, and there are 
two boundary conditions which must be satisfied on this interface. First there is the condition of 
local thermodynamic equilibrium 

T = Tm- r{n)K - p(n)V • n (1) 

where T is the interface temperature, is the equilibrium melting point of the pure material (with 
a flat interface), F is the ratio of the surface energy to the entropy of fusion, k is the Gaussian 
curvature of the surface, /3 is a kinetic coefficient, V is the interface velocity and n is the normal 
vector to the interface. The dependence of F and (3 on n introduces crystalline anisotropy into the 
problem. There is also an energy balance for the motion of the interface 

ksVTs-n-keVTrn = psLfV-n (2) 

where the subscripts s and £ refer to the solid and liquid phases respectively, k is the thermal 
conductivity, ps is the solid density and Lf is the latent heat of fusion. We refer to the solidification 
problem where these boundary conditions are explicitly satisfied as the sharp interface problem. 

One of the computational issues in this problem is that the position of the interface is a priori 
unknown, and therefore enforcement of the boundary conditions is difficult. Rather than track the 
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Figure 2: Computed streamtraces for flow over a growing isolated dendrite. 



phase boundary explicitly, we introduce a continuous order parameter (j) £ [—1,1], where (j) = —1 
corresponds to the liquid, cp = 1 corresponds to the solid, and the level set (j) = is identified 
as the interface. Details of the method, and the selection of parameters to ensure convergence 
of the phase-field model to the sharp interface problem described above, can be found in the 
references. [|6, 12| 

The phase-field method introduces a finite width Wq for the interface, which must be kept small 
if the calculations are to be meaningful. In particular, we require that Wq must be of the order of 
the capillary length do, which is a material property, typically ranging from 1 x 10~^ to 1 x 10~^ m. 
At the same time, the computations must resolve the diffusion field surrounding the dendrite, and 
this can be of order 1 x 10~^ m for physically relevant growth conditions. Finally, the grid spacing 
at the interface must be on the order of Wq to preserve contact with the sharp interface model. 
Thus, the spatial grid must resolve at least five orders of magnitude. Uniform grid approaches are 
clearly limited to two dimensions, and even then they are very computationally intensive. 

In this paper, we resolve this difficulty by solving the phase-field equations on an adaptive 
finite element mesh. The methods are discussed in greater detail in the following sections; just a 
sketch is provided here. The 3D domain is meshed with hexahedral elements, stored in an octree 
data structure. Local error estimators are used to selectively refine or coarsen the mesh, and this 
permits tracking of the interface as well as resolution of gradients in the other fields. There are 
six degrees of freedom at each node (three velocities, pressure, temperature and (j)), and a typical 
computation, such as the one shown in Figure |2| has up to 300,000 nodes, and thus well over 
one million unknowns. While adaptive grid methods make the computations feasible, the full 3D 
problem remains a formidable challenge. 
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The outline of this paper is as follows: In Section 2 we describe numerical methods. In Section 3 
we present a detailed description of the 3D adaptive grid refinement algorithm. Section 4 describes 
a parallel implementation using Charm++, and presents the results of our effort to accelerate 
our code. In section 5 we explain the difficulties generically encountered in measuring interface 
velocities, even with phase field methods, and present accurate schemes to calculate the interface 
velocity. In Section 6, we present results for 3D dendrite growth without and with fluid flow. In 
section 7, we conclude and discuss our results. 



2 Numerical methods 



The numerical implementation of the adaptive grid technique applied to the phase-field model 
without convection has been described in detail elsewhere, so we focus here on the fluid flow 



problem. Beckermann et al. [13| introduced an averaging method for the flow problem coupled 
to the phase-field, and we follow that approach here. Beckermann et al. performed only 2D 
calculations, but their methods extend naturally to 3D. The phase average of a variable ^ for 
phase k over volume AV, is defined as 



l-d> 



(3) 



where G [0, 1] is an existence function. The phase averages of the velocity and pressure are used 
for deriving the mixture continuity equation, averaged liquid momentum equation, and averaged 
energy conservation equation. The formulation ensures that the fluid velocity is extinguished in 
the solid, and further that the shear stress at the liquid-solid interface is handled correctly. 

The governing equations for simulating dendritic growth with fluid flow are the mixture conti- 
nuity equation, averaged liquid momentum equation, averaged energy conservation equation, and 
phase field equation as follows: 



The mixture continuity equation: 

where u is the velocity vector. 
The averaged momentum equation: 
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where t is time, p is pressure, u is the kinematic viscosity, 6 = W0/V2 is the characteristic 
interface width, and /i is a constant (=2.757) which ensures that the interface shear stress is 
correct for a simple shear fiow (see Beckermann et al. [p^). 

• We write the averaged energy conservation equation in terms of a dimensionless temperature 
6 = Cp{T — Tm)/Lf, scaled by the specific heat Cp and latent heat of fusion Lf. 
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where D = utq/Wq in which a is the thermal diffusivity and tq is a time characterizing 
atomic motion in the interface. 

The 3D phase-field evolution equation is given by 



A0(l-(/)2) (l-02)+V- Ty(n)2V(/> 



dJ\V(f>\'W{n 



dW{n) 
dW{n) 



(7) 



where A is a dimensionless constant that controls the tilt of the double well potential which 
forces (p to the attractors at ±1. Anisotropy is included in this equation by writing the 
interface mobility r and width W as functions of the local normal vector n. Following Karma 
and Rappel [0 , we choose 



W{n) = Woas{n); T{n) = TQal{n) 



with 



as{n) = (1 - 864) 



1 + 



1 - 3e4 |V,^|^ 
The constant parameter €4 fixes the strength of the anisotropy in the interface energy. 



(8) 
(9) 



We solve the 3D flow equations using the Semi-Implicit Approximate Projection Method (SIAPM) 
as developed by Gresho [14|. SIAPM is a predictor-corrector method which can solve Eq. (^) and 
Eq. (H) effectively, especially for large 3D problems, because it uses relatively small amounts of 
memory. The velocity degrees of freedom are solved in a segregated form, and the pressure is up- 
dated using a projection method. For a detailed discussion of the algorithm, the reader is referred 



to the original paper |14|, and we present only an operational description of the algorithm here. 
The algorithm consists of four steps: 



1. Compute an intermediate velocity -u""^ from 



—M--K + f]vJ!+^ 
At 2 



At 2 J ' 
-A(w"X - G^p" 



(10) 



where u^~^^ is the vector of nodal values of the intermediate velocity component i at timestep 
n + 1, ii" is the corresponding vector at timestep n, and p" is the vector of nodal pressures 
at timestep n. The coefficient matrices are defined in terms of the velocity shape functions 
N as follows: 



M 



K 



:i 



-N^Ndn 



1) f dN^ dN dN'^ dN^ 
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dx dx 



(11) 
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The velocity field found in the first step is generally not divergence-free. The next step 
corrects the pressure to obtain an approximately divergence-free velocity field by solving a 
Poisson equation for Ap""^^ = — p") : 



where 



I.Ap"+^ = (it"+^ - n^^ (16) 

-d£l (17) 



(1 - 4>) dN'^ dN 
2 dxk dxk 

D = / (l^JV-f^dn (18) 

ace 



is then updated from 



3. Finally, the projected velocity u"^^ is computed in a corrector step by solving 

^n+i ^ _ AtML-iGAp"+^ (20) 

where 

Ml= f ^^Ndfl (21) 
Jn 2 

The computations are started using an initial velocity field uq determined from the boundary 
conditions. In order to obtain a field uq whose discrete divergence is close to zero, we perform J 
iterations (typically 10) iterations on the following system (shown in pseudo-code): 

For I = 1 to J [ 

Lq^ = -Du^"^ (22) 
= u^-^-ML^Gq^ ] 

The variable q plays the role of a temporary pressure update, but the actual initial pressure is zero 
everywhere. 

Eqs. (TO) and (|l6|) are solved by the conjugate gradient (CG) method with diagonal precondi- 
tioning ||l^. The SIAPM can calculate the velocity field for large 3D problems much faster than 
fully implicit time-stepping methods, because convergence is reached for Eq. (|^) in a few iterations 
and the number of degree of freedom of Eq. (|l6|) is one. The CG iteration for Eq. (|l|) converges 
more slowly, typically 50-200 iterations. 

The averaged energy equation Eq. (^ is also solved using the CG method with diagonal pre- 
conditioning. Streamline upwind schemes [|^] are employed for the convection terms in Eq. ( p!o|) 
and Eq. (^). The 3D phase field equation is a nonlinear system. In order to solve the system 
implicitly, an iterative method such as the Newton-Raphson method is required. We use instead an 
explicit time-stepping scheme where a linear system is solved. Stable solutions are obtained from 
the explicit scheme, because the variation of the (f) field exists only in the interface region and a 
sufficiently small time increment At is used. 
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3 3D adaptive grid refinement algorithm 



To resolve the interface, the grid spacing Ax must be smaller than the characteristic interface width 
Wo, which must in turn be on the order of the capillary length, for the solution of the phase field 
model to converge to the sharp interface limit. On the other hand, the system size L required 
for simulation is determined by the size of the diffusion field ahead of the dendrite. A ratio of 
L/Ax ~ 10^ — 10^ is typical. For simulation of a single 3D dendrite, this implies that at least 10 
million elements must be used in a uniform grid. Such a simulation of 3D dendrite growth with 
fluid flow would be intractable. 

In this problem, there is an important characteristic that the various fields vary most rapidly in 
the interface region, whose width is much smaller than L. For this reason, adaptive grid refinement 
techniques can be applied very effectively. The 3D adaptive grid refinement is described in the next 
few sections. 



3.1 Error estimating procedure 

The basis of the code is the element data structure, illustrated in Figure ^. The structure consists 
of arrays for element connectivity, neighbors and also the element pressure. 



type element 
integer 
integer 
integer 

type (connectivity) , pointer 
type (connectivity) , pointer 
integer 

type (neighbor_eleinents) , pointer 

integer 

integer 

integer 

real*8 

integer 

integer 



real*8 

type (element) , pointer 
type (element) , pointer 
end type element 



nujn_element 

level 

Ineigh 

connect 

connect_mid 

midindex(6) 

neighbor 

num_parent (LimitLevel) 

num_history (LimitLevel) 

merge 

error 

nver 

ntype 



pe 

previous 
next 



Element number 
Refinement level 
Number of neighbor elements 
Pointer of connectivity 

Pointer of connectivity for disconnected nodes 
Index to check for discontinuous nodes 
Pointer to neighbor elements 
Parent element numbers 
Time step number 

Index to check if the element should be merged 
Error estimator value at time step n 
Vertex node number for disconnected edge node 
Element type (liquid/solid/interf ace) 



If phi = -1 

If phi = 1 

If -1 < phi < 1 

Element pressure 

Previous element 



ntype 
ntype 
ntype 



l(liquid) 
2(solid) 
3(interf ace) 



in linked list 



Next element in linked list 



Figure 3: Element linked data structure for adaptive grid 

The grid is locally adapted based on an element-by-element error estimate, with a hybrid scheme 
using the magnitude of (p and the interelement variation of the derivatives of 6. We use (j) &s an 
indicator to define the specific region in which the finest elements should be distributed. Specially, 
if an element includes a node where 

(t>min < < (t>max (23) 

then the element is divided until its refinement level becomes the maximum level. We control the 
width of the region with the finest elements through the values of 4)min and (t)max ■ We proceed by 
defining a grid, and then solving a predetermined number of time steps on that grid N^ef (typically 
A^re/=20-100), and then adapt the grid. We require that the interface remains within the fine grid 
during the time steps. We found that 0mm = —0.99 and (t)max = 0.9 gave consistent results. The 
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reason for the asymmetry is that the interface moves from the sohd region > to the hquid 
region (/> < 0. Outside of the interface region {(p < cpmin or (p > (pmax), we used an error estimator 
based on the magnitudes of the derivatives of 9 as follows: 




ve ■ VOdn 



(24) 



where Qe is the element. The estimated error Esub of a subelement divided from a parent element 
with error Ep can be calculated from the asymptotic rate of convergence of the finite element 
approximation as 



where k is the exponent for the asymptotic rate of convergence {k = 2 for linear elements), and 
hsub and hparent are the element sizes of the subelement and parent element, respectively. A limit 
value Eiimit is calculated from 



where 7 is a scale coefficient (typically 10), Nt is the total number of elements, N^ax is maximum 
refinement level, and A'^e is the refinement level of element e. The element is divided until its 
estimated error becomes smaller than a specified limit value. Once the elements to be refined have 
been selected, the grid is refined using the procedure described in the next section. 

3.2 Grid refinement procedure 

The grid refinement procedure begins by storing the pointers of neighboring elements and the 
refinement level of each element. Refinement is required in an element whenever the estimated 
error exceeds the limit value, or when the absolute difference between the level of an element and 
that of its neighbors exceeds one. The latter is called the single-level rule. Refinement is carried 
out successively at each refinement level, beginning with the minimum, according to the procedure 
illustrated in Figure ^ and described below. 

• After checking the computed elemental errors, a set {Eq} is created which consists of elements 
in the current refinement level, whose error exceeds the present limit value. These elements 
will be refined after the neighboring elements which must be refined to satisfy the single-level 
rule are found and refined. 

• A recursive search is then performed to find the outermost elements which need to be refined 
to satisfy the single-level rule. We do this by first defining a new set {Ef} whose initial 
value is {Eq}. The neighbor elements to each element in {Ef} are then examined, and any 
neighboring element that does not satisfy the single- level rule is added to a new set {Efaf}. 
If {Efaf} is not a null set then {Ef} is replaced by {Efaf}. This procedure is repeated until 
{Efaf} becomes a null set. At this point, the set {Ef} contains the outermost elements which 
need to be refined to satisfy the single-level rule, and these elements are then divided into 
eight subelements by bisection of each face. (See Figure ^.) At this point, the set {Ef} is 
again set equal to {Eq}, and the search begins anew. This process is repeated until the search 
for elements to fill set {Efaf} yields a null set. Then, the elements in set {Eq} are refined, 
and the refinement process is completed in the next step. 
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Figure 4: Dividing sequence for refinement. The arrows indicate the sequence of refinement, as 
discussed in the text. 



• The nodal coordinates, element connectivities, neighbor arrays, parent arrays, nodal refine- 
ment levels, etc. are updated. The neighbor array is the array to store the element numbers 
of neighboring elements. The parent array is the array to store the element number of the 
parent elements for the coarsening procedure, described in the next section. 



The recursive search for elements violating the single-level rule in the second step above may 
seem inefficient, because it performs multiple searches, finding the same elements. However, this 
procedure makes it possible to perform the searches with each element knowing only about its 
nearest neighbors. This gives greater efficiency in the computational phase, because it limits the 
memory allocation required within the element data structure. 



3.3 Grid unrefinement procedure 

The unrefinement procedure is accomplished through a loop in which the refinement level is de- 
creased incrementally from the maximum level to the minimum level. 

• A set {{Egf}} is created, containing all elements which are eligible for unrefinement based 
on the value of the error estimator. Each element of {{Egf}} is a subset {Egf}, consisting of 
eight elements to be merged into one. If an element whose refinement level equals the current 
level of the unrefinement loop has smaller estimated error than a prescribed limit value, then 
the element is placed in a temporary set {Efemp}- The elements belongs to {Efemp} are sorted 
into the subsets {Etsf} according to their parent element. 

• If the number of elements belonging to any subset {Efsf} is eight and an element created 
from the eight elements satisfies the single level rule, then {Etsf} becomes the subset {Egf}. 
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• The eight elements belonging to each {£'«/} are then merged into the parent element. As 
we did for grid refinement, the nodal coordinates, elemental connectivities, neighbor array, 
parent array, and nodal refinement levels are updated. 

3.4 Treatment of disconnected nodes 

The adapted grid will have so-called disconnected nodes which appear whenever an element has 
a neighbor whose refinement level differ by one, as illustrated in Figure |5[ An element with 
connectivity [6, 3, 12, 4, 1, 2, 13, 5] contacts two neighbor elements of lower refinement level. 




Figure 5: Configurations of disconnected sides and nodes 



To ensure continuity of the solution between elements, the following constraints are enforced 
for each degree of freedom, here represented by the symbol Vi, 



Disconnected edge mid-nodes: 



Vi + V7 Vi + Vii Vi + Vg 
^2 = , ^5 = 7, ' ■"6 = 7, 



(27) 



Disconnected face mid- nodes: 



Vi + Vr + Vs + Vg f 1 -I- 1^11 -I- ViQ + Vg 

^3 = : , = 



(28) 



After applying the constraints, the original element connectivity is modified to be [8, 12, 10, 9, 
7, 13, 11, 1], and the element shape functions are changed appropriately. This change is made to 
facilitate domain decomposition in the parallel implementation, described next. 



4 Parallelization by CharmH — h 

Charm-|— |- is a message-passing parallel runtime system for machines from clusters of workstations 
to tightly-coupled symmetric multi-processing machines. A parallel FEM code can be writ- 
ten in FORTRAN90 using interface routines from the Charm-|— |- FEM framework. |18| The FEM 



framework program consists of three kinds of subroutines. Service routines such as "INIT" and 
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"FINALIZE" do I/O, startup and shutdown tasks, and are called only on the first processor. The 
main work is done using "DRIVER" routines, replicated on every processor. In our parallel imple- 
mentation, the adapted grid is newly regenerated every 20-100 time steps. After each regeneration, 
the adapted grid is partitioned into chunks assigned to each processor using METIS. [19| This 
function is also handled through the interface to Charm-|— |-. 

In the DRIVER routine on each processor, the temperature and velocity fields are calculated 
using the preconditioned CG method in an the element-by-element scheme, and the (j) field is solved 
using an explicit time stepping scheme, as described earlier. All calculations for the CG method are 
accomplished through the products of the elemental stiffness matrixes and local solution vectors. 
This creates additive contributions for the residual vector for each degree of freedom. Some "shared 
nodes" appear on more than one processor. The array is calculated for each chunk, and the 
values of the field variables for all shared nodes are combined with the other chunks by using the 
Charm-I— I- function "FEM_Update_Field" . For calculating the inner product of arrays and finding 
the maximum error values, each nodal value is combined across all chunks and the shared nodes ares 
not double-counted by using the Charm-|— |- function "FEM_Reduce_Field" . After the computations 
in the DRIVER routine on all processors are completed, the calculated data are transfered from 
DRIVER to INIT. The grid is adapted, then repartitioned. These procedures are repeated until 
the simulation is finished. 

In order to rate the parallel performance of the code, we compute the ratio SP, defined as 

run time on one processor 

SP = v-^^j 

run time on n processors 

An example result is shown in Figure ^, where we used a grid with 296,636 elements and 349,704 
nodes computing over 20 time steps. The value SP=28.8 for 32 processors is typical for our code, 
and shows that the code has been effectively parallelized. 




Number of processors 



Figure 6: Speed-up for Charm-|— |- FEM framework 



5 Tip Velocity Measurement 

The steady state tip velocity is a convenient measure to compare the computational results to 
dendritic growth theories. In the phase-field method, tip velocity is inferred from the temporal 
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evolution of (j) = along the primary growth axis. Karma and Rappel were able to derive 
interface velocities from the numerical solution by using a third-order polynomial to interpolate (p, 
using neighbor and next-neighbor grid point values. Using a 1-D analog problem as a basis, they 
showed that this scheme gives an interface velocity with a small amplitude oscillation about the 
mean, due to the interpolation of the moving front on the fixed grid. 

This multipoint scheme is somewhat problematic when adaptive gridding is used, because the 
selection of points for interpolation is incompatible with our data structure. We developed a new 
scheme using a hyperbolic tangent function to perform the interpolation. This approach provides 
equivalent control of the tip position oscillation, yet it requires only two grid points, corresponding 
to the nodes in the element which contains the interface {(j) = 0). Let us denote the x-coordinates 
and (j) values of these two points as (xi, c^i) and (x2, (p2)- We interpolate (j) along the axis as 



cPix; Xc, W) = -tanh (^-j^ j (30) 

where Xc and W are fitted parameters corresponding to the zero-crossing of (p and the interface 
width, respectively. The parameters Xc and W are determined as follows: 

W = , . 

ln[(l-,/.i)(l + 02)]-ln[(l + 0i)(l-</)2)] ^ ' 

Xc = ^ In — + xi (32) 

2 1 - 01 

The interface velocity is then computed by a finite difference in time between successive interface 
locations. 

To demonstrate the scheme, we examined the same ID test problem considered by Karma and 
Rappel Q, viz. 

= ^°S + ^-^' + ^ 
using tq=1, Wo=1, Ax=0.8 and A=0.02. This problem has a traveling wave solution which prop- 
agates at velocity V=0.041. The computed interface velocity is shown in Figure |^ for a variety 
of interpolation schemes. It can be seen that the hyperbolic tangent interpolation scheme has a 
small oscillation amplitude, slightly smaller than that of the third-order polynomial. The average 
interface velocity can be easily extracted by selecting pairs of interpolated points separated by the 
amount of time it takes to cross one grid spacing. We refer to this as the moving average tip 
velocity. This method is used to compute the interface velocities reported for the calculations in 
the remainder of this paper. 



6 Results 

6.1 2D Verification Problem 



In order to validate our 3D code, we simulated the 2D example analyzed by Beckermann et al. |13|. 
The computational domain is a square with an initial circular seed. A 3D domain is created by 
extending the square domain and the circle seed into the y-direction. 

We used a box of edge length L = 204.8 so that subelements of Ax = 0.8 or 0.4 are created 
through repeated bisection of the domain. The fiow enters at the left edge, with Ux = 1 and Uy = 0. 
The top and bottom surfaces are symmetry boundaries. Beckermann et al. used L = 230.4, and 
similar values for Ax. The problem parameters are: undercooling A=0.55, thermal diffusivity D=4, 
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Figure 7: Interface velocity versus time for 1-D test problem 



and anisotropy €4=0.05. We also used At=0.016, W=l, to=1, and A=6.383. The capillary length 
do{=0'i^) is 0.139, the kinematic viscosity u is 92.4 and the Prandtl number Pr calculated from 
these parameters is 23.1. These physical parameters correspond to succinonitrile. The adapted grid 
is newly regenerated every 20 time steps. We examined two minimum grid spacings, Axmin=0.8 
and A3;min=0.4. The largest element size was Axmax = 3.2 for both cases. 

For 2D dendrite growth without fluid flow, when using Ax=0.8 and Ax=0.4, the steady state 
dendrite tip velocities Vup scaled with D/do are 0.01744 and 0.01689, respectively. Those values 
are in good agreement with solvability theory (0.01700) and previous phase-field results. [|l3| We 
calculated the tip radius pup using the method in Reference |^, where the tip curvature is computed 
using estimates of the second derivatives at the tip interpolated along the two coordinate directions 
at the tip. We obtained ptip/do=lQ.58 and 8.78 for Ax=0.8 and Ax=0.4, respectively. The ratio 
Ptip/do for Ax — > can be computed by Richardson extrapolation, plotting pup versus Ax^. The 
extrapolated value of this ratio for Ax ^ is 8.20, which is somewhat larger than the solvability 
solution (6.90). 

An alternative method to fit the tip radius, given by Wheeler [^], was also used. In this 
method, we fit to an effective Ivantsov solution by calculating the parabolic tip radius p^j^p from the 
slope of z versus x^ in the region behind the tip where the curve becomes straight. The tip radius 
obtained this way is pf^^p = 3.84 and the tip Peclet number 



2D 



0.237 



(34) 



The relative difference between the Pe^ and PgiVAN (^o.257) [^] obtained from the Ivantsov relation 
is less than 8% percent. Thus, using p^-^ does indeed remove uncertainties from using the value at 
just one point. 

Figures ^ and ^ and Table || summarize the results for 2D simulations. For the upstream tips, the 



steady tip velocity for Ax=0.8 and Ax=0.4 agree with the results of Beckermann et al. |13] within 
9% percent. For Ax=0.4, the upstream tip grows approximately 55% faster than in the growth 
without flow at t=100, while the downstream tip grows approximately 45% slower. Beckermann 
et al. reported that the upstream tip grows 40% faster and the downstream tip grows more than 
30% slower. The trend for changing the tip velocities according to the growing directions agrees 
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Figure 8: Tip velocities and tip radii in 2D 



well with the results by Beckermann et al. For the upstream tips, the ratios pup/Pup with and 
without fluid flow are 1.09 and 1.12 for Aa;=0.8 and 0.4, respectively. Those values also agree well 
with the result (1.11) obtained by Beckermann et al. The ratio CTq/ct*, where the subscript 
refers to the the solution in the absence of fluid flow and a* = 2doD / ptip'^Vup, are 1.95 and 0.63 for 
the upstream tip and downstream tip, respectively. These results agree reasonably well with those 
of Beckermann et al. 

A detailed comparison is not meaningful because our simulations have not reached steady state, 
and in some cases are not fully grid converged, as shown in Table [l|. Figure |^ shows interfaces of 
the dendrites with and without fluid flow in 2D. The interfaces are in good agreement with previous 
results. 



6.2 3D computations 

We simulated 3D dendrite growth with fluid flow using a cube computation domain with a spherical 
seed, as shown in Figure The inlet velocity boundary conditions are imposed on the left side 
boundary and their values are Ux=^, %=0, and Uz=0- For this particular problem, we again have 
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Figure 9: Computed interfaces in 2D witliout and with fluid flow at t=16, 72, and 104 



Table 1: Results for the simulations of dendrite growth in 2D and 3D 





Solvability solution(2D) 


2D(Ax=0.8) 


2D(Ax=0.4) 


3D(Ax=0.8) 


Vtip{no flow) 


0.489 


0.502 


0.486 


0.915 


Vtif (flow) 




0.766 


0.761 


1.038 






0.324 


0.276 


0.800 






0.533 


0.516 


0.915 


Ptip{no flow) 


0.959 


1.47 


1.22 


2.90 


upstream / n \ 

Ptip (flow) 




1.60 


1.36 


3.50 


^downstream (a A 

Ptip (now) 




1.50 


1.28 


2.41 



U = 1. The largest Ax is 6.4, while the smallest Ax is 0.8. Time increments of At=0.016 and 
At=0.008 are used to model growth without and with fluid flow, respectively. The other parameters 



are identical to the parameters for the 2D analysis in previous section. Figure 11 shows several of 
the grid configurations in the analysis. The scale factor 7 used in the adaptive grid procedure was 
10 for this simulation. 

The simulations using Ax=0.8 and Ax=0.4 consumed about 40 hours and 190 hours, respec- 
tively, on a single processor of the IBM RS/6000 machine with clock rate of 200MHz. In the 
simulation where Ax = 0.8, the initial mesh consisted of 10,607 elements, and the final mesh had 
208,357 elements. These numbers should be compared with a fully dense mesh, which would have 
had over 4 x 10^ elements. With fluid flow, computing the velocity and pressure fields consumes 
approximately 80-90% of the total CPU time. We performed some the simulations using a different 
time increment for velocity field Aty, larger than the time increment for temperature and (p fields 
Atg^. For Ax=0.8 and Atv=5Atg^, the CPU time decreased by 76%, and the results for pup and 
Vtip agree within 5% with the results obtained by using Atv=Ate(j,. The detailed results presented 
next were computed using Atv=Atg^=0.008 in order to compare our results with the previous 
2D results. This run took approximately 250 hours on 16 processors of the ORIGIN2000 at the 
National Center for Supercomputing Applications (NCSA). The same run using Atg0=O.O16 and 
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Figure 10: Schematic diagram of 3D computation domain and boundary conditions 



Atv=5AtQ^ consumed approximately one eighth of the run time for Atv'=Ate0 =0.008, and the 
deviation in the results was less than 3%. Thus, a run time of about 32 hours on 16 processors 
yielded essentially the same result. 

Without fluid flow, the steady scaled tip velocity Vupdo/D is 0.0318 and the ratio Vup /V^fp 
for Ax=0.8 is 0.55. Karma and Rappel Q reported that this ratio is 0.39 for A=0.65 and effective 
anisotropy ee=0.0269. The pup/do is 20.86 and the ratio Pti^/Pti^ for Ax=0.8 is 0.51. The ratio 

2Ddo/plpVup. 



a. 



is 7.09, where a* is the selection constant a* 



The time evolution of the dendrite tip velocities and tip radii are shown in Figure 12. With fluid 



flow, the upstream tip grows 13% faster than it does without flow, while the downstream tip grows 
13% slower. We stopped the calculations at t=86.5, before the diffusion field encounters the end of 
the computational domain. These results seem to show that the effect of 3D flow on the upstream 
dendrite tip velocity is much weaker than the effect of 2D flow. However, the reason for the trend 
is that the effect of forced flow on tip velocity is highly dependent on the ratio U/Vup where U is 
the inlet velocity. The ratio U /V^J^ is 45% smaller than the ratio f7/V^|f , and this accounts for the 
reduced effect. The ratio pup/ Pup is 1-21 for 3D for the upstream tip, which is slightly larger than 
the ratio pup/ p%p=l.09 for 2D. The tip radius for the downstream in 3D decreases by 17% due to 
the flow effect, while the tip radius in 2D increases by 2%. Thus, the effect of the forced flow on 
the downstream tip is much stronger in 3D than in 2D. The cause of the trend is that in 2D the 
fluid flows vertically over the dendrite, while in 3D the fluid flows both vertically and laterally over 
the dendrite. 

Figure ^ shows several interfaces in the x-z plane. The tilt angle of the transverse arms into 
the flow in 3D is 2.5°, while the tilt angle in 2D is 2.3°. From Figure [l^ we can see that sidebranches 
begin to appear on the upstream side of the transverse tip more quickly than on the other sides. 



Bouissou et al. |21] showed that when surface tension effects are neglected, the Peclet number 
Pev computed using Vup is related to the Peclet number computed using the inlet velocity Pejj for 
a given undercooling. In order to assess the effect of the forced flow on the tip Peclet number. We 
computed the the difference between the tip Peclet number with fluid flow Pey = p^ /2D and 
the tip Peclet number without fluid flow Pcy = p^ /2D , divided by the Peclet number related 
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Figure 12: Tip velocities and tip radii in 3D 



to the inlet velocity Pe(j = Up^ /2D 



APe 



Pe 



V 



Pe'y 



Pe 



(35) 



We found that for the upstream tip in 2D, PefP is 0.305, and it is similar to Pe^ (=0.275), even 
though the effect of the forced flow on the tip velocities and tip radii in 2D and 3D is much different. 

We found that the ratio (T^/a* for 3D is 1.675 for upstream tip and 0.603 for the downstream 
tip. With flow present, it takes somewhat longer for a* to settle onto a steady value, as seen in 
Figure ||. The trend in a* for the upstream tip with fluid flow agrees with experiments by Bouissou 
et al. pit] on alloys of PVA and ethanol, while the trend is opposite to the experiments by Lee et 



al. [22 1 on SCN forced past a needle. Further work is needed to examine this phenomenon. 
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Figure 13: Computed interfaces in 3D without and with fluid flow at t=9.6, 43.2, and 86.4 



7 Conclusions 

In this paper, we presented an efficient algorithm using 3D adaptive grid refinement in order to 
study the effect of fluid flow on 3D dendrite growth using a phase-field model. We simulated 
3D dendrite growth with fluid flow using adapted grids with at most 300,000 elements, while over 
4,000,000 elements would be required for an uniform grid. We also present a parallel implementation 
by using Charm-|— |- FEM framework. A speed-up factor SP=28.8 for 32 processors is typical for 
the larger meshes. 

We also introduce a new scheme using a hyperbolic tangent function to interpolate tip position. 
We found that the scheme, which requires just two points for interpolation, has a smaller oscillation 
amplitude than third-order polynomial interpolation using 4 points. 

Test cases showed that the dendrite tip velocities and radii for 2D dendrite growth were in 



satisfactory agreement with solvability theory and previous computational results |13|. We found 
that the effect of fluid flow on dendrite tip velocity in 2D is much larger than in 3D, because the 
ratio U/V^p is approximately two times larger than the ratio U/V^p. For the downstream tip, 
the tip radius in 3D decreases by 17%, and the tip radius in 2D increases by 2%. The cause of the 
trend is that in 3D the fluid flows both vertically and laterally over the dendrite and the effect of 
the forced flow in 3D is much stronger than in 2D where the fluid flows vertically over the dendrite. 

We examined the effect of the forced flow on the tip Peclet number by computing APe: the 
difference between the tip Peclet number in the growth with fluid flow and the tip Peclet number 
in the growth without the fluid flow, divided by the Peclet number related to the forced flow. We 
found that APe in 2D and in 3D are within 10%, even though the tip radii and velocities vary by 
much more than that. In 3D growth with fluid flow, a* for the upstream tip decreases by 39%. 
The trend for a* for the upstream tip agrees with experiments (reduction of 37%) by Bouissou et 
al. 1 21 1 on alloys of PVA and ethanol. The ratio Gq/g*^^ for the upstream tip is 14% smaller than 
the ratio (T^ct*^^. 
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Figure 14: (T*s in 3D 



Acknowledgments 

This work has been supported by the NASA Microgravity Research Program, under Grant NAG8- 
1249, and the National Science Foundation, under grant NSF DMS 98-73945. The authors wish to 
thank S. Kale, M. Bhandarkar, O. Lawlor and T. Hinrichs for their help with the parallelization of 
the code. We also thank C. Beckermann and A. Karma for helpful discussions and sharing of their 
results. 

References 

[1] W. W. Mullins and R. F. Sekerka. J. Appl. Physics 35, 444 (1964). 

[2] J. Langer. "Lectures in the Theory of Pattern Formation", in Chance and Matter, Les Houches 
Session XLVI, edited by J. Souletie, J. Vannenimus and R. Stora, North Holland, Amsterdam 
629-711 (1987). 

[3] D. A. Kcsslcr, J. Koplik and H. Lcvinc. Adv. Phys. 37, 255 (1988). 

[4] E. Ben-Jacob, N. Goldenfeld, B. Kothar and J. Langer. Phys. Rev. Lett. 53, 2110 (1984). 

[5] D. Kessler, J. Koplik and H. Levine. Phys. Rev. A 30, 3161 (1984). 

[6] A. Karma and W.-J. Rappel. Phys. Rev. E. 53, 3017 (1995). 

[7] N. Provatas, N. Goldenfeld and J. Dantzig. Phys. Rev. Lett. 80, 3308 (1998). 

[8] Y.-T. Kim, N. Goldenfeld and J. Dantzig. Phys. Rev. B 62, 2471 (2000). 

[9] S. Chen, B. Merriman, S. Osher and P. Smereka. J. Comp. Phys. 135, 8 (1997). 
[10] S. H. Davis. Theory of Solidification (Cambridge University Press, 2001). 
[11] J. A. Dantzig and L. S. Chao. In Proc. 10th U.S. Nat. Cong, of Appl. Mech., edited by J. Lamb, 



249 (1986). 



20 



[12] N. Provatas, N. Goldenfeld and J. Dantzig. J. Comp. Phys 148, 265 (1999). 

[13] C. Beckermann, H.-J.Diepers, I. Steinbach, A. Karma and X. Tong. J. Comp. Phys. 154, 468 
(1999). 

[14] P. M. Gresho, S. T. Chan, M. A. Christen and A. C. Hindmarsh. Int. J. Numer. methods 
fluids 21, 837 (1995). 

[15] X. Mikic and E. C. Morse. J. comp. Phys. 61, 154 (1985). 

[16] A. N. Brooks and T. J. R. Hughes. Comp. Methods Appl. Mech. Eng. 32, 199 (1982). 

[17] L. V. Kale and S. Krishnan. In Parallel Programming using C++, edited by G. V. Wilson and 
P. Lu, 175 (MIT Press, 1996). 

[18] M. Bhandarkar and L. V. Kale. In Proceedings of the International Conference on High Per- 
formance Computing (HiPC) (December 2000). 



[19] G. Karypis. http: / /www-users, cs. umn. edu/C karypis /metis / . Univ. of Minnesota (2000). 



[20] A. A. Wheeler, B. T. Murray and R. J. Schaefer. Physica D 66, 243 (1993). 
[21] P. Bouissou, B. Perrin and P. Tabeling. Phys. Rev. A 40, 509 (1989). 
[22] Y.-W. Lee, R. Ananth and W. N. Gill. J. Cryst. Growth 132, 226 (1993). 



21 



